GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification

Authors

Abstract

Graph neural networks (GNNs) have achieved great success in node classification tasks. However, existing GNNs are naturally biased towards the majority classes, which have more labelled data, and neglect the minority classes, which have relatively few labelled nodes. Traditional techniques often resort to over-sampling methods, but these may cause overfitting. More recently, some works propose to synthesize additional nodes for the minority classes from the labelled nodes; however, there is no guarantee that the generated nodes really stand for the corresponding minority classes. In fact, improperly synthesized nodes may result in insufficient generalization of the algorithm. To resolve this problem, in this paper we seek to automatically augment the minority classes from the massive set of unlabelled nodes in the graph. Specifically, we propose GraphSR, a novel self-training strategy that augments the minority classes with a significant diversity of unlabelled nodes, and which is based on a Similarity-based selection module and a Reinforcement Learning (RL) selection module. The first module finds a subset of unlabelled nodes that are most similar to the labelled minority nodes, and the second one further determines the representative and reliable nodes from that subset via an RL technique. Furthermore, the RL-based module can adaptively determine the sampling scale according to the current training data. This strategy is general and can be easily combined with different GNN models. Our experiments demonstrate that the proposed approach outperforms the state-of-the-art baselines on various class-imbalanced datasets.
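The similarity-based selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes node embeddings are already available as plain vectors, summarizes the labelled minority nodes by their centroid, and ranks unlabelled nodes by cosine similarity to that centroid. The names `select_candidates`, `cosine`, and `centroid`, the toy embeddings, and the choice of cosine similarity are all hypothetical. The RL-based module that would further filter these candidates is not shown.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def select_candidates(
    embeddings: Dict[int, Sequence[float]],
    labelled_minority: List[int],
    unlabelled: List[int],
    k: int,
) -> List[int]:
    """Return the k unlabelled node ids most similar to the minority centroid.

    Assumption: similarity to the class centroid stands in for whatever
    similarity measure the paper actually uses.
    """
    centre = centroid([embeddings[n] for n in labelled_minority])
    ranked = sorted(
        unlabelled,
        key=lambda n: cosine(embeddings[n], centre),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (hypothetical): nodes 0 and 1 are labelled minority nodes,
# nodes 2, 3 and 4 are unlabelled.
embeddings = {
    0: [1.0, 0.0],
    1: [0.9, 0.1],
    2: [0.0, 1.0],
    3: [0.8, 0.2],
    4: [0.1, 0.9],
}
candidates = select_candidates(
    embeddings, labelled_minority=[0, 1], unlabelled=[2, 3, 4], k=1
)
print(candidates)
```

In this toy setting node 3 lies closest to the minority centroid, so it is the only candidate kept when `k=1`. In a self-training loop, such candidates would be handed to a second selection stage before being added to the training set with the minority label.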

Related resources

On Mining Fuzzy Classification Rules for Imbalanced Data

A fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, such that it performs poorly with respect to the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

Error back-propagation algorithm for classification of imbalanced data

Classification of imbalanced data is pervasive but it is a difficult problem to solve. In order to improve the classification of imbalanced data, this letter proposes a new error function for the error backpropagation algorithm of multilayer perceptrons. The error function intensifies weight-updating for the minority class and weakens weight-updating for the majority class. We verify the effect...

Intelligent Rule Mining Algorithm for Classification over Imbalanced Data

Association rule mining for classification is a data mining technique for finding informative patterns in large datasets. The output takes the form of if-then rules, with attribute-value combinations in the antecedent and the class label in the consequent. This method is popular for classification as rules are simple to understand and allow users to look into the factors leading to a specific class ...

An Improved Algorithm for SVMs Classification of Imbalanced Data Sets

Support Vector Machines (SVMs) have strong theoretical foundations and excellent empirical success in many pattern recognition and data mining applications. However, when induced by imbalanced training sets, where the examples of the target class (minority) are outnumbered by the examples of the non-target class (majority), the performance of SVM classifier is not so successful. In medical diag...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i4.25622